Search results for "Distributed database"
Showing 10 of 23 documents
Rings for Privacy: an Architecture for Large Scale Privacy-Preserving Data Mining
2021
This article proposes a new architecture for privacy-preserving data mining based on Multi-Party Computation (MPC) and secure sums. While traditional MPC approaches rely on a small number of aggregation peers replacing a centralized trusted entity, the current study puts forth a distributed solution that involves all data sources in the aggregation process, with the help of a single server for storing intermediate results. A large-scale scenario is examined, together with the possibility that data become inaccessible during the aggregation process. Traditional schemes often neglect this possibility; here it is explicitly examined, as it might be provoked by intermittent network connec…
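A minimal sketch of the classic secure-sum primitive that such ring architectures build on (illustrative only, not the paper's protocol): the initiating node masks its value with a random offset, each peer adds its own value to the running total, and the initiator strips the mask at the end, so no peer ever sees another's raw value.

```python
import random

MODULUS = 2**32  # arithmetic modulo a large constant so masked partial sums leak nothing

def ring_secure_sum(private_values):
    """Illustrative ring-based secure sum. The first party adds a random mask
    to its value before sending; every other party adds its own value; the
    first party removes the mask from the final total."""
    mask = random.randrange(MODULUS)
    running = (private_values[0] + mask) % MODULUS  # initiator sends masked value
    for v in private_values[1:]:
        running = (running + v) % MODULUS           # each peer adds its contribution
    return (running - mask) % MODULUS               # initiator strips the mask
```

Fault tolerance for sources that drop out mid-round, the focus of the abstract above, requires additional machinery beyond this basic primitive.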
ImageRover: A Content-Based Image Browser for the World Wide Web
1997
ImageRover is a search-by-image-content navigation tool for the World Wide Web (WWW). To gather images expediently, the image collection subsystem utilizes a distributed fleet of WWW robots running on different computers. The image robots gather information about the images they find, computing the appropriate image decompositions and indices, and store this extracted information in vector form for searches based on image content. At search time, users can iteratively guide the search through the selection of relevant examples. Search performance is made efficient through the use of an approximate, optimized k-d tree algorithm. The system employs a novel relevance feedback algorithm that se…
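One way to picture the approximate k-d tree search mentioned above (a sketch, not the paper's optimized algorithm): build a standard k-d tree, then cap the nearest-neighbor descent with a node-visit budget, trading exactness for speed. All names here are illustrative.

```python
def build_kdtree(points, depth=0):
    # Recursively split points on alternating axes.
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, budget=50):
    """Depth-first nearest-neighbor search that stops exploring once the
    node-visit budget is exhausted, returning the best candidate found so far
    (the 'approximate' part of the search)."""
    best = [None, float("inf")]
    visits = [0]

    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))

    def search(n):
        if n is None or visits[0] >= budget:
            return
        visits[0] += 1
        d = dist2(n["point"])
        if d < best[1]:
            best[0], best[1] = n["point"], d
        diff = query[n["axis"]] - n["point"][n["axis"]]
        near, far = (n["left"], n["right"]) if diff < 0 else (n["right"], n["left"])
        search(near)
        if diff ** 2 < best[1]:  # only cross the splitting plane if it can help
            search(far)

    search(node)
    return best[0]
```

With a generous budget the search is exact; shrinking the budget bounds query time at the cost of occasionally returning a near-optimal neighbor, which is usually acceptable for image retrieval.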
Streamlining distributed Deep Learning I/O with ad hoc file systems
2021
With evolving techniques to parallelize Deep Learning (DL) and the growing amount of training data and model complexity, High-Performance Computing (HPC) has become increasingly important for machine learning engineers. Although many compute clusters already use learning accelerators or GPUs, HPC storage systems are not suitable for the I/O requirements of DL workflows. Therefore, users typically copy the whole training data to the worker nodes or distribute partitions. Because DL depends on randomized input data, prior work has shown that partitioning impacts DL accuracy. Those solutions focused mainly on training I/O performance over a high-speed network but did not cover the data stage-in pro…
Collaborative Assessment of Information Provider's Reliability and Expertise Using Subjective Logic
2011
In a Q&A setting, each user can individually estimate the expertise and the reliability of her peers using her direct interactions with them and our framework. The online SN (OSN), which can be considered as a distributed database, performs continuous data aggregation for assessing users' expertise and reliability in order to reach a consensus. We emulate a Q&A SN to examine various performance aspects of our algorithm (e.g., convergence time, responsiveness, etc.). Our evaluations indicate that it can accurately assess the reliability and the expertise of a user with a small number of samples and can successfully react to the latter's behavior change, provided that the cognitive traits hold in practice.
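To make the consensus idea concrete, here is a sketch of the standard cumulative fusion operator from subjective logic, which combines two independent binomial opinions (belief, disbelief, uncertainty) about the same peer; whether the paper uses exactly this operator is an assumption.

```python
def fuse(op_a, op_b):
    """Cumulative fusion of two binomial subjective-logic opinions (b, d, u),
    each summing to 1. Combining independent observations shrinks the
    uncertainty component, e.g. when two users pool their assessments of an
    information provider's reliability."""
    b_a, d_a, u_a = op_a
    b_b, d_b, u_b = op_b
    kappa = u_a + u_b - u_a * u_b  # assumes at least one opinion is uncertain (kappa != 0)
    b = (b_a * u_b + b_b * u_a) / kappa
    d = (d_a * u_b + d_b * u_a) / kappa
    u = (u_a * u_b) / kappa
    return (b, d, u)
```

The fused opinion remains a valid opinion (components still sum to 1) and its uncertainty is lower than either input's, which is why repeated aggregation drives the network toward consensus.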
Object Clustering Methods and a Query Decomposition Strategy for Distributed Object-Based Information Systems
1999
Emerging developments and advances in distributed processing have created a need for tools and methods to partition and distribute information systems across interconnected processors. In particular, distribution approaches which take into account the key characteristics of OO concepts are required to extend traditional fragmentation results to object-oriented database systems. To fulfill the above requirements, we propose a methodology for the distribution design of object-based information systems. The underlying approach consists of techniques and heuristics that can be used to create clusters of interrelated object classes that can be fragmented interdependently, producing distribution…
Accelerating data queries on Hadoop framework by using compact data formats
2016
There are massive amounts of data generated from IoT, online transactions, click streams, emails, logs, posts, social networking interactions, sensors, mobile phones and their applications, etc. The question is where and how to store these data in order to provide faster data access. Understanding and handling Big Data is a big challenge. Research in Big Data projects using Hadoop technology, MapReduce-style frameworks, and compact data formats such as RCFile, SequenceFile, ORC, Avro, and Parquet shows that only two of these formats (Avro and Parquet) support schema evolution and compression in order to use less storage space. In this paper, file formats like Avro and Parquet are c…
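The schema-evolution property mentioned above can be sketched as follows (record and field names are hypothetical): in Avro, adding a field with a `default` lets readers using the new schema still decode records written under the old one, which is what makes the format safe for long-lived stored data.

```python
# Avro schemas are commonly expressed as JSON; Python dicts mirror that shape.
OLD_SCHEMA = {
    "type": "record", "name": "ClickEvent",
    "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "ts", "type": "long"},
    ],
}

# Evolved schema: the added optional field carries a default, so records
# written with OLD_SCHEMA remain readable under NEW_SCHEMA (the core of
# Avro's schema-resolution rules).
NEW_SCHEMA = {
    "type": "record", "name": "ClickEvent",
    "fields": OLD_SCHEMA["fields"] + [
        {"name": "referrer", "type": ["null", "string"], "default": None},
    ],
}
```

Removing a field is symmetric: it is safe as long as old readers had a default for it, which is why evolution rules constrain both writers and readers.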
Infiniviz: Taking Quake 3 Arena on a Large-Scale Display System to the Next Level
2018
The authors of this paper have previously presented a large-scale display system called Infiniviz in other publications. Infiniviz attempts to improve network bandwidth consumption and computational performance compared to other existing large-scale display systems. Since the previous publications were made in the early development stages of Infiniviz, only an overview of the software architecture and details of the hardware implementation have been presented so far. This paper contains a real-life test of Infiniviz running Quake 3 Arena at a resolution of 9600 x 5400 at 24 fps. Also, in this paper, the authors have tried to match their results to what has been published by other researchers …
HybridS: A Scheme for Secure Distributed Data Storage in WSNs
2008
In unattended wireless sensor networks (WSNs), data is stored locally or at designated nodes upon sensing, and users can access it on demand. This paradigm can improve energy efficiency and system robustness by making use of upcoming cheap, large flash memories. Nevertheless, the security and dependability of distributed storage are critical for the applicability of such WSNs. In this paper, we propose a secure and dependable data storage scheme by taking advantage of secret sharing and Reed-Solomon codes, which has computational security and yet maintains optimal data size. Extensive analysis verifies that our scheme can provide secure and dependable data storage in WSNs in the…
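For a sense of the secret-sharing half of such a scheme, here is a minimal (k, n) Shamir sketch over a prime field; it is illustrative of the primitive, not the paper's construction, and omits the Reed-Solomon coding entirely.

```python
import random

PRIME = 2_147_483_647  # a Mersenne prime, large enough for small demo secrets

def make_shares(secret, n, k):
    """Split `secret` into n shares such that any k reconstruct it and
    fewer than k reveal nothing: evaluate a random degree-(k-1) polynomial
    with constant term `secret` at n distinct nonzero points."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 recovers the constant term (the secret).
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # pow(den, PRIME - 2, PRIME) is the modular inverse (Fermat's little theorem)
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

Storing shares at k-of-n sensor nodes tolerates node loss (dependability) while any coalition smaller than k learns nothing (security); erasure coding such as Reed-Solomon then reduces the storage overhead.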
GPCALMA, a mammographic CAD in a GRID connection
2003
The purpose of this work is the development of an automatic system which could be useful for radiologists in the investigation of breast cancer. A breast neoplasia is often marked by the presence of microcalcifications and massive lesions in the mammogram: hence the need for tools able to recognize such lesions at an early stage. GPCALMA (Grid Platform Computer Assisted Library for MAmmography), a collaboration among Italian physicists and radiologists, has built a large distributed database of digitized mammographic images (at this moment about 5500 images corresponding to 1650 patients). This collaboration has developed a CAD (Computer Aided Detection) system which, installed in an integrated…
Distributed medical images analysis on a Grid infrastructure
2007
In this paper, medical applications on a Grid infrastructure, the MAGIC-5 Project, are presented and discussed. MAGIC-5 aims at developing Computer Aided Detection (CADe) software for the analysis of medical images on distributed databases by means of GRID Services. The use of automated systems for analyzing medical images improves radiologists’ performance; in addition, it could be of paramount importance in screening programs, due to the huge amount of data to check and the cost of related manpower. The need for acquiring and analyzing data stored in different locations requires the use of Grid Services for the management of distributed computing resources and data. Grid technologies allow…